Resampling Technique for Imbalanced Class Handling on Educational Dataset

Abstract

Educational data mining is an emerging field in data mining. Accurately identifying student accomplishment in a current or upcoming course can help an institution build better technology-aided education. The field is becoming more important to study because of its potential to produce knowledge-base models that can help even teachers and lecturers. Like other classification tasks, educational data mining has a common and frequently discovered problem: the imbalanced class problem, a condition where the classes are not distributed in the same proportion. In this research, the dataset was found to be a severely imbalanced multiclass dataset consisting of more than two labels. Given the problem stated above, this paper focuses on handling the imbalanced classes with several methods, both classification methods such as Linear Regression, Random Forest, and Stacking, and the SMOTE, ADASYN, and SMOTE-ENN resampling algorithms. The methods are evaluated using 10-fold cross-validation and an 80-20 splitting ratio. The results show that the best performance comes from ADASYN resampling with the 80-20 splitting ratio, at a 0.97 F1 score. The study also shows that resampling improves classification performance. The no-resampling setup produced decent performance too, because the general pattern of the dataset was already good from the start. Thus, there are no real drawbacks if the original dataset is processed.
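To make the resampling idea concrete, the following is a minimal sketch of SMOTE-style oversampling for a multiclass dataset, written with the Python standard library only. It is not the authors' implementation: the function name `smote_oversample`, the toy feature values, and the labels "pass", "fail", and "withdrawn" are all hypothetical, and the sketch assumes purely numeric features. Each class below the majority count is topped up with synthetic points interpolated between a sample and one of its k nearest same-class neighbours. ADASYN and SMOTE-ENN build on the same idea: ADASYN generates more points around harder-to-learn samples, and SMOTE-ENN follows the oversampling with an edited-nearest-neighbours cleaning step.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Raise every class to the majority count via SMOTE-style interpolation."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point (itself excluded)
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: three imbalanced classes with two numeric features.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2], [9.3, 0.9]]
y = ["pass"] * 4 + ["fail"] * 2 + ["withdrawn"] * 3

X_res, y_res = smote_oversample(X, y, k=2)
print(Counter(y_res))
```

In an evaluation like the one described, resampling would normally be applied to the training portion only (the 80% split, or the training folds of each cross-validation round), so that synthetic points do not leak into the data used for scoring.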

Similar resources

Resampling Imbalanced Class and the Effectiveness of Feature Selection Methods for Heart Failure Dataset

Clinical datasets commonly have an imbalanced class distribution and high dimensional variables. Imbalanced class means that one class is represented by a large number (majority) of samples more than another (minority) one in binary classification [1]. For example, in our research dataset there are 1459 instances classified as “Alive” while 485 are classified as “Dead”. Machine learning is gene...

Class-Boundary Alignment for Imbalanced Dataset Learning

In this paper, we propose the class-boundary-alignment algorithm to augment SVMs to deal with imbalanced training-data problems posed by many emerging applications (e.g., image retrieval, video surveillance, and gene profiling). Through a simple example, we first show that SVMs can be ineffective in determining the class boundary when the training instances of the target class are heavily outnum...

Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem

The class imbalance problem occurs in various disciplines when one of the target classes has a tiny number of instances compared to the other classes. A typical classifier normally ignores or neglects to detect a minority class due to the small number of class instances. SMOTE is one of the over-sampling techniques that remedy this situation. It generates minority instances within the overlapping regio...

Traffic Sign Recognition System for Imbalanced Dataset

In a classification problem, the most important factor is the training dataset, which affects the classification accuracy rate. However, we encounter imbalanced datasets in real-world applications. In such a dataset, the number of images in some classes is much smaller than the number of images in other classes. So the classification estimate tends toward the majority class and minority classes will be ...

Journal

Journal title: Jurnal Informatika: Juita

Year: 2023

ISSN: 2579-8901, 2086-9398

DOI: https://doi.org/10.30595/juita.v11i1.15498